arXiv:1502.01533v2 [math.ST] 19 Oct 2016 


Non-equispaced B-spline wavelets 


Version with detailed proofs 


Maarten Jansen 
Universite libre de Bruxelles 
Departments of Mathematics and Computer Science 

October 20, 2016 


Abstract 

This paper has three main contributions. The first is the construction of 
wavelet transforms from B-spline scaling functions defined on a grid of non- 
equispaced knots. The new construction extends the equispaced, biorthog- 
onal, compactly supported Cohen-Daubechies-Feauveau wavelets. The new 
construction is based on the factorisation of wavelet transforms into lifting 
steps. The second and third contributions are new insights on how to use 
these and other wavelets in statistical applications. The second contribution 
is related to the bias of a wavelet representation. It is investigated how the 
fine scaling coefficients should be derived from the observations. In the con¬ 
text of equispaced data, it is common practice to simply take the observations 
as fine scale coefficients. It is argued in this paper that this is not acceptable 
for non-interpolating wavelets on non-equidistant data. Finally, the third con¬ 
tribution is the study of the variance in a non-orthogonal wavelet transform 
in a new framework, replacing the numerical condition as a measure for non¬ 
orthogonality. By controlling the variances of the reconstruction from the 
wavelet coefficients, the new framework allows us to design wavelet trans¬ 
forms on irregular point sets with a focus on their use for smoothing or other 
applications in statistics. 
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1 Introduction 


Ever since the early days of wavelet research, spline wavelets have enjoyed special 
attention in the community. Spline wavelets combine the benefits from a sparse 
multiscale approach using wavelets and the well known properties of splines, in¬ 
cluding the closed form expressions, the numerous recursion relations, the polyno¬ 
mial based r egularity of the basis functions dUnseri.119971) . 

Splines tide Boot] , 1200ill , formally defined by the recursion as in Q below, 
are piecewise polynomials of a certain degree, with continuity constraints in the 
knots that connect the polynomial pieces. The position of these knots and the de¬ 
grees of the polynomial pieces are key parameters in the numerous methods in 
computer aided geometric design and computer graphics that are based on splines. 
Splines are also a popular tool in numerical analysis, for instance in interpolation. 
Compared to full polynomial interpolation, spline interpolation is far less sensitive 
to numerical instabilities that lead to oscillations. The good numerical condition 
is linked to the fact that any spline function can be decomposed into a basis of 
compactly supported piecewise polynomials, so-called B-splines. In statistics, the 
coefficients of the B-spline decomposition can be estimated in a nonparametric re¬ 
gression context. The estimator typically minimizes th e res idual sum of squares, 
penalized by t he roughness of the regression curve ( Green and Silvermanl. 1994 : 


E ubank. 19991) - The spline wavelet smoothing, as discussed in Section [5] of this 


paper, can be considered as an extension of these smoothing splines towards spar¬ 
sity oriented penalties and corresponding nonlinear smoothing based on threshold¬ 
ing. While smoothing splines have their knots on the locations of the observations, 
P-splines (Ruppert et al.. 120031) allow a flexible choice of knots. An important ad¬ 
vantage of any spline, whether it be an interpolating, smoothing or P-spline, is that 
it is known by an explicit expression. The main merit of a closed-form expression 
is that all information about the smoothness of the function, typically expressed 
by the Lipschitz regularity, is readily available for use in smoothing algorithms. 
The spline wavelets on irregular knots constructed in this paper are splines, and 
thus share this benefit. This is in contrast to most other wavelets, especially those 
on irregular point sets. The smoothness of these wavel ets d epends on the limit of 
an infinitely iterated refinement or subdivision scheme (Daubechies et ah, 1999b[), 
which can be hard to analyse, even for straigh tforward refinement schemes such as 
for Deslauries-Dubuc i nterpolating wavelets ([ Deslaur iers and Dubud. 1 19871. Il989l : 


Donoho and Yu, 1999; Sweldens and Schroder, 


The combination of splines and wavelets is, however, not trivial. One of the 
problems is that splines do not provide a naturally orthogonal basis. On the other 
hand, orthogonality is much appreciated in wavelet theory, because wavelet trans¬ 
forms operate scale after scale. In the inverse transform, i.e., in the reconstruction 
of a function from its wavelet coefficients, this scale to scale process amounts to 
the refinement or subdivision, mentioned above. Although the smoothness of the 
reconstruction in a spline basis does not depend on this refinement scheme, other 
properties do. These properties include the numerical condition of the transform 
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as well as the bias and the variance in statistical estimation. The assumption of 
orthogonality facilitates the analysis of these properties throughout the_subdivi- 
sion scheme. Initial spline wavelet constructions were orthogonal (Battle, 1987 


L emarie . 119881) . The price to pay for the orthogonality was that the basis functions 


did not have a compact support, and related to this, that transformation matrices 
were full, not sparse, matrices. 

The condition of orthogonality can be relaxed if the basis functions within one 
scale are allowed to be non-orthogon a l, wh i le the wavelets at differen t scales are 
still kept orthogonal ( Chui and Wang . 1992 : Unser et al. . 1992 . 1993 1. This con¬ 
struction leads to a semi-orthogonal spline basis. Since the non-orthogonality oc¬ 
curs only within each scale, this has little impact on the asymptotic analysis of the 
refinement process. Moreover, semi-orthogonal wavelet bases can be constructed 
from non-orthogonal spline bases, such as B-splines. The B-splines and the re¬ 
sulting wavelets have compact support. As a consequence, the reconstruction of 
data from a decomposition in these bases uses a sparse matrix. Sparse matrices 
and compact supports lead to faster algorithms, but also contribute to better control 
over manipulations on the wavelet coefficients. A manipulation of a coefficient, 
such as a thresholding or shrinkage operation, has only a local effect. 

Unfortunately, the forward transform of observations into the semi-orthogonal 
wavelet basis still requires the application of a full, non-sparse, matrix. This is 
because in general the inverse of a sparse matrix is not sparse. Sparse inverses 
are possible in a wide range of fast wavelet transforms, but the combination of the 
semi-orthogonality and the spline properties cannot be obtained within a sparse for¬ 
ward transform. As a consequence, every fine scale observation has at least some 
contribution to each wavelet coefficient. It would be better if not only a wavelet co¬ 
efficient had local impact on the reconstruction of the observations but also, at the 
same time, the coefficient got its value from a limited number of observations. The 
latter, dual, form of compact support is poss ible in the framework of b i-orthogonal 
spline wavelets. The construction by ICohen, Daubechies. and Feauveaul (119921) 
of a multiresolution analysis starting from B-splines has led to non-orthogonal 
wavelets with sparse decomposition and reconstruction matrices. 

The Cohen-Daubechies-Feauveau wavelets are defined on equispaced knots. 
This is because the classical multiresolution theory starts from scaling bases whose 
basis functions are all dilations and translations of a single father function. This 
construction is not possible on irregular knots. B-splines, on the other hand, are 
easily defined on non-equispaced knots. This paper extends the construction by 
Cohen, Daubechies and Feauveau towards these non-equispaced B-splines. For 
the construction of wavelet t ransforms on non-equispaced knots, someti mes termed 
second generation wavelets (Daubec hies et alJ.ll999al : ISweldenslll998l) . this paper 
adopts the lifting scheme (iSweldensl. 119961) . The lifting scheme provides for every 
refinement operation in a wavelet transform a factorisation into simple steps within 
the scale of that refinement. The key contribution of this paper is to identify the 
lifting steps that are necessary in the factorisation of a B-spline refinement on non- 
equidistant knots. 
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Section [2] summarizes results from the literature that are necessary for the main 
contribution in Section [3TT1 First, Section [2~Tl dcfines the notion of multiscale grids. 
Then, Section 1231 gives a definition of B-splines together with their properties for 
further use. Section 12.31 proposes to use a refinement equation as a definition for 
B-splines. In order to fill in coefficients in the equation, it needs to be factored into 
elementary operations. The result is a slight generalisation of the well known fac¬ 
torisation of equidistant wavelet transforms (Daubcchics and Sweldens, 19981)- Fi¬ 
nally, Section [3731 further investigates the role of the factorisation in wavelet trans¬ 
forms. The main contribution in Section [3TT1 fills in the lifting steps that constitute 
a B-spline refinement. Section [3~4l gives an expression for all possible wavelets 
that fit within the B-spline refinement scheme. The first part of this paper, about 
the construction of non-equispaced B-spline wavelets is concluded by the short 
Section [331 on the non-decimated B-spline wavelet transform. 


Ot her work on lif t ing for spline or B-sp l ine wavelets, such as (LB ertra mt 12004 


Cheml 119991: iFahmvL 120081 : iLi et all 120051 : iPrestin and OuakL 120051: IXiang et al. 


20071) is situated on equidistant kn ots or is focused on specific cases, such as 


Powell-Sabin spline wavelets dYanraes et all 20041) . B-spline wavelets on non- 
equispaced knot s have been studied for specific applic ations and with particular 
lifting schemes ( Pan and Yao . 2009) : Lvche et al. . 2001 ). It should be noted that 
B-splines on non-equispaced knots are often termed non-uniform B-splines. This 
term is avoided in this paper, as in statistical sense, a uniform set of knots could 
be interpreted as a set of random, uniformly distributed knots, which are of course 
almost surely non-equidistant. 

The second part of the paper consists of the Sections [4] and [5] It concentrates 
on the use of the B-spline wavelets from the first part in statistics. In statistical 
applications, B-spline wavelets are used, for instance, in a soft threshold scheme 
for noise reduction. Given that a soft threshold comes from an t\ regularised least 
squares approximation of the input, this application is an example of a penalised 
spline method. The discussions in Sections [4] and [5] are quite general, and therefore 
applicable to other non-equispaced wavelets as well. 

The discussion in Section[4]is related to the bias in a B-spline wavelet smooth¬ 
ing. It investigates how to proceed from observations to fine scaling coefficients. 
It is argued that in a situation with non-equispaced data, this step should be taken 
with c are, in order to avoi d to commit what some authors call the “wavelet crime” 
(IStrang and Nguyen. |1996D . 

While Section [4] deals with bias, Section [5] is about the variance in a second 
generation wavelet transform. Assuming that the observations are independent, the 
variance of the transformed data is best understood if the transform is orthogonal. 
As the construction by Cohen, Daubechies and Feauveau, extended in this paper 
towards non-equidistant knots, has somehow less attention for orthogonality, this 
may cause major problems with estimators su f fering from large v ariance effects 


dYanraes et all 120021: 1 Van Aerschot et all 120061: iJansen et all 1200 9). Although the 


large variance is due to the transform being non-orthogonal, the classical numeri¬ 
cal condition number is not a satisfactory quantification of the statistical problem. 
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Therefore this paper proposes a multiscale variance propagation number, based on 
the singular values of the linear projection onto the coarse scale B-spline basis. 
From there, an alternative B-spline wavelet transform is developed. The alterna¬ 
tive is closely related to the Cohen-Daubechies-Feauveau construction, but it keeps 
the variance propagation under control. 


2 B-splines and multiresolution 

This section reviews definitions and well established results on B-splines, multi¬ 
scale representations and the lifting scheme. 

2.1 Multilevel grids 

Let K n = {xk\k = 0,... , n — 1} be a set of knots on which we will define B- 
splines. B-splines of order p are basis functions spanning all piecewise polynomials 
of degree p — 1 with continuous p — 2 derivatives in the knots. In this paper, 
the B-spline basis will be constructed through a process known as refinement or 
subdivision. For this process to work, we first have to define coarse scale versions 
of the grid of knots. We thus identify the input set of knots as the fine scale grid, 
formalised as = . 17 .. The index J refers to the highest or finest scale. From 
there, we define grids {x 3 j.\k = ()...., rij — 1 } at coarser scales j, where j = 
L, L + 1,..., J — 1, L being the lowest or coarsest scale. Obviously rij < n 
stands for the number of points at scale j. Denoting Ajj. = Xj^+i — Xj £, and 
A j = sup fc=0i r)) _2 A ; .fc, we call the grid at scale j regular if A 3 p does not 
depend on k, i.e., A jf. = A j. The focus in this paper lies on irregular grids. 

Definition 1 (multilevel grid) The sequence of grids constitutes a multilevel grid 
if the following conditions are met: 

1. The sequence rij is strictly increasing. 

2. There exist constants R £ R and 8 > 0 so that maximum gap at scale j is 
bounded as follows 

A j < RnjP. (1) 


Condition dTJ is a slightly stricter version of the definition adopted in (Daubechies et al 


200ll) . where in the context of binary or dyadic refinement, i.e., n, = 2 J , it is im¬ 


posed that Y%Ll Aj < 00 . The condition can be understood by considering a se¬ 
quence of functions Xj : [0,1] —>- 1R : rr 1 —^ Xj(u) for which Xj^ = xj(k/ (rij — 1)). 
The divided differences of Xj(u) in the knots k/(rij — 1) are then A / /.(n J — 1). The 
divided differences must not or at most very slowly converge to a locally infinite 
derivative, in order not to leave any coarse scale gaps in a grid at fine scale. 

A multilevel grid is nested if Xj + \j- € {xjj.\k = 0,..., rij — 1}. In particular', 
the multilevel grid is two-nested if at each level, the grid is a binary refinement 
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of the previous, coarser level, that is, if Xj+i^k = x j,k- This paper works with 
two-nested multilevel grids only. 


2.2 B-splines at a fixed scale 


Throughout this paper, ip v p k (x) will stand for the B-spline of order p defined on 
the knots Xj^. There exist several recursion expressions for the_construction of 
B-spli nes. This paper will use the following formula (Ou and Gregory, 1992; 


Daubechies et ah. 120011 . page 497) as definition. 


Definition 2 (B-splines) The B-splines of order 1 defined on die knots x hl are 
the characteristic functions (pfki x ) = Xj,k( x )> w ^ iere Xj,k( x ) = 1 ^ x € 
[xjk,Xjk- |-i) and Xj,k(x) = 0 otherwise. B-splines of order 1 are also known 
as B-splines of degree 0. 

B-splines of order p, i.e., degree p — 1 ,forp > 0, are defined recursively as 



+ 


X - X j,k- [p/ 2 ] 

X j,k+\f/ 2 ]-l ~ X j,k-[f/2\ 
X j,k+\f/2-} ~ X 
X j,k+\f/2] - x j,k-[p/2\+l 


^ J _ ( x ) 
^ j,k-l+rem(p/2) K ' > 

~ (x). 

^ j,k+rem(p/2 ) ^ ^ 


( 2 ) 


hi this equation rem (p/q) = p — q \_p/q\ denotes the remainder from an integer 
division. 


Later on in this paper, the construction through recursion will be replaced by a con¬ 
struction through refinement. On a finite set of knots, i.e., when k <E {(),..., n 3 — 
1 }, both constructions are equivalent if we follow the convention in © that the left 
and right end points are multiple knots. More precisely, whenever a knot index in 
© is outside { 0 ,... , rij — 1 }, then we take xjj = Xj t o for l < 0 and x 3 - r = Xj iUj -1 
for r > rij — 1. Definition [2] associates with every knot x 3 ± a B-spline function 
\p\ 

tfj [.(x). The function turns out to be centered around the corresponding knot, as 
can be seen from the following result. 

Theorem 1 (piecewise polynomials on bounded intervals) For k € { [p/ 2j ,..., n— 

ra 

1 — [p/2 ]} the function ip 1 - /(x) is zero outside the interval [Xj t i h , xj ^ k ), where 

ra 

h = k — \p/ 2j and r k = k + [p/2]. Inside this interval, pr- k {x) is a polynomial 
of degree p — 1 between two knots Xj± and x ]k+ \, while in the knots, the function 
and its first p — 2 derivatives are continuous. 

The proof follows by induction, using Definition |2] 

FI 

Theorem [T| should be amended for functions p- / near the boundaries, i.e., 
for k close to 0 or n 3 — 1. The specification is postponed to the moment where 
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the functions have been redefined using refinement instead of recursion. We first 
concentrate on the interior interval, defined by 


\v\ 

Ij = {x € [x jt0 ,x jinj -i)\yk = 0,.. 1 : y) k {x) ^ 0 => l k > 0 and r k < nj-1}, 

(3) 

with l k and r k as defined in Theorem Q] It is straightforward to check that 


h = 


X j,p— I ’ X j,rij—p 


(4) 


From Theorem Q3 it is obvious that the set of B-splines | ^pfi, | generates piece- 

wise polynomials. Conversely, it can be verified that any piecewise polynomial on 
the interior interval Ij can be decomposed as a linear combination of B-splines. 

Theorem 2 ( B-spline basis) Let fj(x) be a function which is polynomial on each 
interval [xj^, Xj }k +i)for which {xj tk , Xj^+ 1 } C Ij and which has p —2 continuous 
derivatives in the knots x 3 ± € Ij. Then there exist constants a. j j. so that for all 
x € Ij, 

rij—l— \p/I\ 

fj( x ) = J2 a 3,kV>fk( x ) (5) 

fc=L?/2j 


Proof. See Appendix IB. II 

P 

Remark 1 Most references in literature would adopt the symbol iVj (;/;) for a 
shifted version of this basis, namely 


S^ = N fl- L?/ 2 j (x) - 


The notation with iVj ^ (x) leads to more elegant expressions for the recursion of 
B-splines, but it does not correspond to the common practice in wavelet literature 
where a basis function ipj jk {x) is centered around the point Xj j fc. 

The forthcoming discussions will use expressions for the derivatives of spline 
functions. 


Lemma 1 The derivative of a B-spline is given by 

H 


d [?]/ n 

J (x ) = (p_!) 


^j,k— l+rem(p/2) 


Cj,fc+rem(p/2)' 


X- 


j,k+\p/ 2 ]-l X j,k-[p/2\ X j,k+\p/ 2 ] X j,k- [p/ 2 j +1 


( 6 ) 
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Lemma Q] can be proven by induction on p. The computations are facilitated by 

ra 

working on the shifted index in iV - f (x). 

As a consequence of Lemma[t] a linear combination of the B-splines 


f ( \ [p] H / \ 

fi( x ) = E s jMM x ) 


k&7L 


has a derivative equal to 


f't ^ t~ i'i S j> k S j,k-1 [ p _1 ] l \ 

fji x ) = (p - 1 ) 2 ^----- <p 

fees X i,fc+[p/2]-l X j,fc-Lp/2j 3) 


(V) 


( 8 ) 


where r' = 1 — rem(p/2). 

In the search for a refinement relation for B-splines, an important role will be 
played by the decomposition of the power functions x q of degrees q = 0,... ,p — 

1. For q = 0, Definition [2] allows us to conclude that the basis functions are 
normalised so that 


Vs £ I 


3 ' 


<pfl ( x ) = 


(9) 


fc =0 


for any order p. This property is referred to as the partition of unity. For general 
q < p, the expansion of x q in a B-spline basis can be established according to the 
following result, which is closely related to Marsden’s identity (Lee, .1996). 


Theorem 3 (power coefficients) For q = 0,1,... ,p — 1, there exist coefficients 
xfk \ so that for x € Ij, 


E ~[p,g\ [p] / \ 

X 3,k 


( 10 ) 


fees 


1. For q = 0, these coefficients are one, according to the partition of unity. 

2. For q = 1,... ,p — 1, the coefficients can be found by the recursion 


xM _ =[H , 9 [p-i^-i] 

1 p _ ^ j,k—r' 


X -, J = x; , L + . p 1J (Xj,k+W2\-1 - x j,k- [p/2\) > (H) 


where r' = 1 — rem(p/2), as in ©. 

3. 7/i particular, for q = 1, these coefficients satisfy 

r?/ 2 ] -i 


_[p,t] _ i 

p_i 


E 

i=l-[p/2j 




( 12 ) 








4. For q = p — 1, this becomes 


-[p.p— i] 


v j,k 


m-i 

n 

i=l-[p/2j 




(13) 


Proof. See Appendix IB. 21 


Remark 2 Section [7771 will be based on a general reading o/( m which describes 

the transition from x^' k _ i to ■ Since the results in Theorem [3] do not depend 
on the ordering of the knots, the same formula as in Ml D can also be used to 

recompute the coefficients , when one knot is taken out from the grid and 
replaced by another. In ([77} the knot x- k _ ^ 2 j is replaced by x- _ v while 

the factor ~ ^ depends only on knots that are left untouched. 

Expression m is an example of a formula that is simplified by working in the 
[pi -~[p q\ [p q 1 

shifted basis N!jf(x). Putting tj : k = x 'j,fc+[p/ 2 J ’ and f.k = k+[f/ 2 \ ’ we get 


iM _ 7 M Q fp-Tq-f ( _ 

l j,k l j,k-l ~ _ j T j,k [fj,k+p-1 



(14) 


The following theorem states that no other basis reproduces polynomials with 
functions that have a support between p + 1 knots. 

Theorem 4 (uniqueness by power coefficients) Let x.jj,. for k = 0, ... ,rij — 1 be 

ra 

the knots Xjy. at level j, and let k (x) be a set of basis functions associated to 

ra 

these knots. If the support of ip l - k (x) equals Sj^ = [xj,i k ,Xj jrk ), with 4 and r^ 

as in Theorem [7] and if the coefficients x in the decompositions d70D . for all 
x € Ij and for all q £ {0.1,... ,p — 1}, are given by the values in Theorem\3\ then 
\p\ 

Pj k (x) must be the B-splines of order p defined on the given knots. 

Proof. See Appendix IB. 3 1 Theorem [4] motivates the use of the power function 

coefficients as a defining property in the design of a refinement scheme for 
B-splines. 


2.3 B-spline refinement schemes 


In this section, we consider B-spline functions at different scales, i.e., with 
different indic es j. The first r e sult s t ates that a construction th rough refinement 


must exist, see lOu and Gregorvl (ll992lf : lDaubechies et al.l d200ll. (26), page 497). 
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Theorem 5 (existence ofB-spline refinement) On a nested multilevel grid, B-spline 

ra 

basis functions at scale j are refinable, i.e., there exists a refinement matrix H - 
so that, for all x € Ij, 


, sr tt\p\ H 
Vj,k( x ) = z. H jJ,k<Pjii/ x )- 
e=o 


(15) 


Proof. This is an immediate consequence of Theorems [Q and [2j A B-spline on 
a grid at level j is a piecewise polynomial with knots in Xj^. Since these knots are 
also knots at scale j + 1, the B-spline is also a piecewise polynomial at scale j + 1, 
and so it can be written as a linear combination of the B-spline basis at that scale. 

□ 

Equation (fT5T) is an instance of a two-scale equation, also known as a refinement 
equation. The general form of the refinement equation, without superscripts for the 
order of the B-splines, is 


n j +1 

Vj,k(x) = H iAWj+i/( x )- ( 16 ) 

i =o 

The refinement equation can also be condensed into a matrix form 


®j{x) = ®j+i{x)Hj, ( 17 ) 

where 

( x ) = [<Pj,o{x) <Pj,i(x) ■ ■ ■ l Pj,n j -i(x)] ( 18 ) 

is a row of scaling basis functions. 

In the first instance, a refinement equation should be read as the definition of 
the scaling functions from the refinement matrices Hj. A numerical solution of the 
equation thus allows us to evaluate the scaling functions, in particular the B-spline 
functions. Secondly, the refinement equation will be the basis for the construction 
ofB-spline wavelets, as explained in Section [TT] This motivates the search for the 

[pl 

spline refinement matrix H - 1 . 

A refinement matrix Hj is often band-limited, as follows from the next lemma, 
valid for general refinable scaling functions. 

Lemma 2 ( band-limited refinement matrices) Let <pj k(x), for j = 0,... ,rij — 1 
be a set of scaling functions, with refinement equation ( I/6D . Let Sj k denote the 
support of (pj t k(x), then an entry Hj^ ma )’ be different from zero only if Sj + \y C 
Sj,k- 

Proof. Suppose H^f j. would be nonzero for a fine scaling function outside the 
support of the coarse scaling function, then obviously, that coarse scaling function 
would take a nonzero value outside its support. □ 

For the B-spline basis, Theorem [2] translates as follows. 
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ra 

Corollary 1 The columns of the matrix H j in rt/51 ) can have at most p+1 nonzero 
elements. In particular 

H fi,k + 0 => 2k - [p/2\ <t<2 k+ [p/2] . (19) 

\p\ 

Proof. The support of a B-splme is S jtk = [x jk _]^ /2 ^x jk+ ^~/ 2 -\]. We 

have 

S j+l,t C S j)k [x j+M - [~/ 2 J , X j+lt z+ fp/ 2 ] ] C i x j,k- [p/ 2 j > X j,k+ [p/ 2 ] ] 

^ [p/2j ’ X j+1/+ [p/2] ] C [ X j+l,2fc-2 [p/2j ’ X j+l,2k+2 [p/2] ] 

it- [p/2\ >2k-2 [p/2\ and 
^ \ i + [p/2] < 2k + 2 [p/2] 

4^ £e{2k-[p/2\,...,2k+\p/2-]}. 

As #{2A: — [p/2j ,..., 2k + [p/2] } = [p/2] + [p/2j + 1 = p + 1, the number 
of nonzero elements in column k is bounded by p + 1 . □ 

The maximum number of nonzero elements in each column is known as the band¬ 
width of the matrix, see Definition Obelow. 

2.4 Factorisation of the refinement matrix 

[pi 

The objective is of course to identify the nonzero entries of Hj . The strategy 
followed in this paper is based on a factorisation that can be applied to any band- 
limited refinement matrix Hj. The factorisation starts from a partition of the rows 
of the matrix into an even and an odd subset, leading to submatrices Hj e and H J O 
and so that the refinement equation (fT71) can be written as 

j , e {x)H j e -f- Q j+\ 0 (x)H j Q . (20) 


Remark 3 The matrix Hj e is a squared matrix, because in the nested refinement, 
the even subset of knots at scale j + 1 are exactly the knots at scale j. 


These submatrices can then be factored in an iterative way, alternating between 
two sorts of factorisation steps. The alternation is the matrix equivalent of Euclid’s 
algorithm for finding the greatest common divider ( Daubechies a nd Swelden s J 19981) . 
The superscript [s] in the theorem refers to the factorisation step. It should not be 
confused with the superscript [p] referring to the order of a B-spline. In Section 
13.11 both s and p will appear in single superscripts. The subsequent result uses the 
following definition for bandwidth of a rectangular matrix. 


Definition 3 (bandwidth of a refinement matrix) Let Abe an mx n matrix, where 
in each column j = 1 ,,n, there exists a row i\ (j) so that 


Aij ± 0 h(j) < i < h(j) + 6 — 1 , 
with b independent from j, then the bandwidth of this matrix is b. 


( 21 ) 
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Theorem 6 (factorisation into lifting steps) Given a refinement matrix H - and 

r s i r s i 

the submatrices r and /r 0 containing its even and odd rows respectively. If 

M [ S 1 

H ■ g has a larger bandwidth than H ■ ^ then we can always find a lower bidiagonal 
matrix C/j s+1 ^ and a matrix Hj S ^ with smaller bandwidth than that of so that 


zA s ] _ td s +l] 

H j,e ~ H j,e 


tt[s+1] tt[s] 
U 3 3,°’ 


( 22 ) 


fsl [si 

If Hj ' 0 has a larger bandwidth than Hj f , then we can always find an upper bidiag¬ 
onal matrix p\ s 1 and a matrix H\ s f^ 1 with smaller bandwidth than that of h\ s \ 

J 3)° J 3) e 

so that 


ul s ] _ uA+l] _i_ p[ s +l] ztH 

H 3,o ~ H j,o + F j H j,e’ 


(23) 


If Hj ^ anc l Hjl h ave the same bandwidth, then both d22D and f ! 22 D are possible. 


Proof. See Appendix [C] 

The matrices p\ s+1 ^ are known as dual lifting steps or prediction st eps, where 
both terms refer to an interpretation beyond the scope of this paper (Sweldens. 


19981) . As a matter of fact, the interpretation as a prediction is even not applicable 


in the context of this paper. The matrices Uj S+1 ^ are primal lifting steps or update 
steps. 

The next sections explain how the factorisation into lifting steps can be used in 
the design of a multiscale decomposition of a B-spline basis on irregular knots. 


3 Non-equispaced B-spline wavelet transforms 

3.1 Main contribution: the design of B-spline lifting steps 

Theorem fallows us to develop a lifting scheme for B-splines resting on an analysis 
of power function coefficients only. We will impose that if Sj+\.k = fc , 

then dj t k must be zero, while ■ In principle, the development of this 

\p\ 

condition can proceed directly on the band matrix Hj , without having to use the 
lifting factorisation. Solving the corresponding linear system seems, however, not 
to yield a simple formula. 

[pi 

The lifting factorisation of H- is found in a relatively straightforward way, 
because all lifting steps are essentially the same sort of operation, described in 
the following proposition. The successive lifting steps differ from each other only 
in the subsets of knots that are involved in the operation. As a consequence, we 
formulate the operation for an arbitrary subset of knots. The elements of the subsets 
are denoted by t J/t . Obviously, all tj i coincide with a knot from the Xk, but the t J3 
may be unordered and the subset may contain coinciding values tj# = t J3 . The 
proposition constructs a recursion (fldl) on the selected knots t 3% . The recursion 
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(fl4l) has originally been stated for t J)t that are shifted sorted knots, but nothing 
prevents us from using it for more general sets of tjj. 

Proposition 1 (lifting step for B-spline power coefficients) Given an order p, let 

—PjQd 

Kj^k = {tj,ki • • • 1 1 j'k+p} an unsorted set of knots. Define the coefficient tj j, 
by the recursion in 1741) . using the knots {tj^+i, ■ ■ ■, tj k+ ~_f\. Define the coeffi¬ 
cients and i^k+i ^ t ^ le same recursion, but using { tj ..., t ■ k+ ~_ 2 } and 

{tj,k+ 2 , ■ ■ ■, tj,k+p} respectively. 

Define the lifting parameters 


L 


j,k,k—l 


t 


j,k+p j,k+p— 1 

tj,k+p — 


and 1 


tj,k+1 tj,k 

tj,k+p ~ tj.k 


(24) 


Then the lifted coefficient equals, 

for all values of q = 0 ,..., p — 1, 


l j,k 


,k+p-1 tj,k+1 j{p,g,Llb] 

J- 




(25) 


where t 


dp,q,Lib] 


j,k 


is the coefficient defined by the recursion in m using the knots 


{tj.ki tj,k+2i ■■■■, tj tk+p _ 2 , tj k+p \- 


In other words, a lifting step with only two parameters, has a common effect on all p 
power coefficients: it takes out the knots tj jk +1 and t- k+ ~_ 1 to replace them by tj k 
and tj k+ ~. The lifting step has thus a similar effect as the recursive formula (fl4l) . 
The difference between lifting and recursion is that the recursion uses coefficients 
of different power functions in B-splines of different degree, while lifting is based 
on coefficients of a single function in a single basis. Moreover, the lifting formula 
is the same for all power functions in that basis. 

The proof of this theorem is based on the recursion of (fl4l) . As it is rather 
technical, it can be found in Appendix iDl 

The lifting operation presented in Theorem Q] lies at the heart of the linear 

transform that maps fine scale power coefficients k onto coarse scale ver¬ 
sions Xj’if 1 ' along with zero detail coefficients . We adopt the symbol in 

~\p,q\ 

contrast to tj k to emphasise that we are working now on the sorted knots xj )k . 

The update lifting steps C/j take care of coarse scaling coefficients, starting from 
the even fine scale coefficients. Each update step takes out two odd indexed, fine 
scale knots aSj+i : 2 fc±( 2 m+i) from the intermediate scaling coefficients and adds two 
coarse scale knots Xj tk ±t = Xj+\, 2 k±' 2 t that are outside the fine scale range. The 
indices m and t depend on the lifting step s, as developed in Proposition [2] In 
a similar way, the prediction lifting steps P- 1 take care of the detail coefficients, 
operating on the odd fine scaling coefficients. A prediction step takes out two odd 
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indexed, fine scale knots Xj+i, 2 fc± 2 m+i from the definition of the intermediate co¬ 
efficient, replacing it by two new coarse scale knots. 

The final prediction step has a special role. It is supposed to take out the last 
remaining odd knot twice. That is, m = 0 , so that £j+i,2fc+2m+i = x j+i,2k-2m+i- 
In terms of the unsorted knots tj^ in Proposition Q3 the knots are numbered so that 
tj k + p_i = tj,k+ 1 - The outcome of the lifted power coefficient in dT]) is then zero, 
as requested. The preceeding prediction steps should therefore be such that one odd 
knot is left over for the final step. Depending on the number of odd knots in the 
beginning, this may imply that the first prediction step takes out only one odd knot. 
In that case, the first prediction step is a diagonal instead of a bidiagonal matrix. 
Whether a prediction matrix has one or two nonzero (off-)diagonals is controlled 
by the variable t, defined in Proposition [2] Similar considerations hold for the 
update steps, for which the variable u controls the number of nonzero diagonals. 
All together, we arrive at the following lifting scheme for B-splines. 


Proposition 2 ( Main result: lifting scheme for B-splines) Suppose that + are 
scaling coefficients at scale j + 1 defined on the knots Xj + i Then consider a 
lifting scheme with u update steps and u-\-r prediction steps, where integer u and 
boolean value r are given by u = [_(P + 1)/4J , and r = [p/2] — 2 u,for a given 
integer p. Furthermore, let r = p — 2\_p/2\ be a boolean indicating the parity of 
p. For every s G {1 — r,..., u\, define the values m = u — s, t = r +2s — 1, 
u = r ■ min(r + s, 2), t = r ■ min(l + s, 2). 

Then construct the following lifting scheme. First, define the even and odd 
factors, for s £ { 1 ,..., u }, and for s G {1 — r,... , u}, respectively, 


C j+l,k 

c [s] 

C j+l,2k 

>+ s ] 

c j+l,2k+l 


>-l] 

L j+1,2k 

[r+s-1] 

c j+l,2k+l 


%j+l,2k+2m+l %j-\-l,2k—2m—l 

x j+l,2k+2m+2t ~ x j+l,2k-2m-2t+u 
3?7+l,2/c+2m+l %j-\-l,2k— 2ra+l 

x j+l,2k+2m+2t+2 ~ x j + i 2k-2m-2t+t 


(26) 

(27) 

(28) 


Second, for s € {1,..., u}, define an update matrix Vas a lower bidiagonal 
matrix with entries 


U. 


j,k,k 


u. 


w 


j,k,k -1 


C j+l,2k 

[r+s-1] 

L j+l,2k+l 

[r+s-1] 

L j+l,2k-l 


x j+l,2k—2m—l x j+l,2k-2m-2t+u 
x j+l,2k+2m+2t ~ x j+l,2k-2m-2t+u 

x j+l,2k+2m+2t x j+l,2k+2m+l 
x j+l,2k+2m+2t ~ x j+l } 2k-2m-2t+u 


(29) 

(30) 


Set the corresponding lifted even coefficients 


[s] _ [s—1] r7 -[s] [r+s—1] . TT [s] [r+s—1] 

S j+l,2k ~ S j+l,2k + ^j,k,k-l S j+l,2k-l + ^j,k,k s j+l,2k+V 


(31) 


14 








Third, for s € {1 — r,..., u), define a prediction matrix p\ r+s ^ as an upper bidi¬ 
agonal matrix with entries 


Jr+s-1] 


pb+s] _ 
r j,k,k 

C j+l,2fc+l 

Xj+l,2k+2m+2t+2 

•£j+l,2fe+2m+l 

(32) 

c [s] 

C j+t,2k 

Xj+l,2k+2m+2t+2 ~ 

” X j+l,2k-2m-2t+t 

ph+ s l _ 

r j,k,k +1 

[r+s-1] 

L j+l,2k+l 

3^7+1,2/c—2ra+l — 

^ j+ l,2fc— 2m—2t+t 

(33) 

c [s] 

c j+l,2k+2 

Xj+l,2k+2m+2t+2 ~ 

~ X j+l,2k-2m-2t+t 


Set the corresponding lifted odd coefficients 


>+s] _ [r+s-l] _ p [r+s] [s] _ p [r+s] [s] 

b j+l,2k+l ~ *j+l,2fc+l j,k,k b j+l,2k r j,k,k+l b j+l,2k+2' 'Tv 


Finally, define the diagonal rescaling matrix Dj as 

t~\ _ [it] 

U i,k,k — c j+l,2ki 

and the scaling and detail coefficients at scale j as 


Si k — i,S 


tdj,k — 


j,k,k° j+l,2k 

[r+u] 

b j+l,2k+V 


(35) 

(36) 

(37) 


Then, the power coefficients in a B-spline basis at scale j +1, defined in Theorem 0 
and denoted as s, J+ \y = x^-j k , are transformed by this lifting schemes into coarse 


scaling coefficients Sjj,,. = phis detail coefficients dj j. = 0. Consequently, 
by Theorem 0 the refinement equation associated to this lifting scheme has the 
B-splines of order p on the non-equidistant knots x -jj : as its solution. 


Remark 4 The lifting scheme for B-splines should thus be understood as a gradual 
coarsening of fine scale representation of polynomials. This is unlike some other 
lifting constructions, where lifting steps are designed as a way to improve, i.e., 
“lift higher” existing wavelet transforms, by gradually adding more properties. In 
the B-spline case, one could for instance, try to lift a linear B-spline into a cubic 
B-spline. Such a construction is, however, impossible with a single lifting step. 


3.2 Examples of B-spline lifting schemes 

This section develops concrete examples of Proposition [2] 


3.2.1 The Haar scaling functions 

The simplest case of a B-spline basis is that of B-splines of order one, i.e., degree 
zero. The basis functions are characteristic functions on the intervals between two 
knots. This is the Haar scaling basis defined on these knots. 
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As p = 1, we find u = 0, meaning that the lifting scheme has zero update 
steps. There will be one prediction step because r = 1. This single prediction step 
is defined by (l32l) and (1331) with indices s = 0, t = 0, m = 0, and r + s — 1 = 0. 
From (1261) . it follows that all factors f are equal to one in this prediction. We 
also find r = 1 , and so, t = 1 , meaning that the prediction will be a diagonal 
matrix. This is confirmed by substitution of all the indices in (l32l) and (l33l) . We 


= 1 . 


find that Pf} k , k = 1 and pj ^ fe+1 = 0. We also find that D J;k , k = c^_ 12fe 

This lifting scheme defines the refinement equation and hence the Flaar scaling 
functions, in a way developed in Section l3~3l It does not yet fix the Haar wavelet 
basis. There are several options for these basis functions, including the classical 
Haar ba sis U,- k (x) = i 2 /,_\ (x ) — pj + ]. 2 k(x), but also the Unbalanced Haar 
basis (Girardi and Sweldens, 19971 1. Each option can be realised by one additional 
update step, as explained in Section [3~4l 


3.2.2 The linear B-spline scaling functions 

Linear splines reproduce constant and linear functions, hence p = 2. We find that 
u = 0 , so the lifting scheme for the refinement equation has again no update step. 
As before, there is one prediction step, and also the indices s = 0, t = 0, m = 0, 
and r + s — 1 = 0 remain the same as in the Haar case, again leading to the conclu¬ 
sion that the factors cj^J , f are equal to one. In contrast to the Haar case, we now 
have r = 0, and from there, t = 0. The effect of this is that the prediction is now a 
bidiagonal matrix, with elements P^j k = (■x j+1>2 k +2 ~ x j+1}2 k+i)/{xj+i, 2 k +2 ~ 
Tj+i, 2 fc) ttnd Pj k k _)_i (■t'j+i, 2 fc+i ■Tj-t-i, 2 fc)/(‘Fj+i, 2 fc +2 Tj+i^fc)- find 
again D jtk}k = cf[ 12k = 1. 

This matrix can be interpreted as a linear interpolation in the odd covariate val¬ 
ues. Therefore, the constant and linear splines have refinement schemes consisting 
of, respectively, constant and linear polynomial interpolation in a single predic¬ 
tion step. Higher order splines cannot be associated with higher order polynomial 
interpolation, as illustrated by the next example. 

3.2.3 The cubic B-spline scaling functions 

For cubic B-splines, we set p = 4, from which it follows that u = 1, meaning 
that we have one update step. There will be one prediction step, as r = 0, so 
the lifting scheme starts with the update. The index s £ {1 — r,..., u} = { 1 } 
takes only one value, and so do the indices t = 1 , m = 0, and r + s — 1 = 
0. Furthermore r = 0, from which it follows that t = 0 = u, meaning that 
both the update and the prediction are bidiagonal matrices. For the update, we 

find u fl,k = -(Xj+I,2k-1 - x j+ i : 2 k -2)/(xj+i,2k+2 - Xj+ i, 2 fe—2) and ^ = 
-(z.j+i,2fc+2 - x j+ i t 2k+i)/(Xj+i, 2k +2 - Xj+i,2k-2)- The subsequent prediction is 

given by pjjj.k = (x j+li 2 k+i-Xj + i, 2 *+i)/(si+i, 2 fc+ 4 -®j+i,2^-2) and Pfl tk+l = 
C Xj+i , 2k+1 - x j+h 2k-2)/{x j+l! 2k+4 - Xj+I,2k- 2)- The diagonal elements of the 
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final rescaling become D hk , k = c^ 12fc = (xj + i, 2 k+i-Xj+i, 2 k-i)/(xj+i, 2 k +2 ~ 
x j+l,2k-2)- 

3.3 From the factorisation to a multiscale transform 

The two-scale equation (fT71) containing the matrix Hj in Section [23] concentrated 
on the refinement of coarse scale functions fj(x) or coarse scale basis functions 
&j(x). The argument used for the design of Hj in Proposition [2] went the other 
way, starting from the finer scale j + 1, and imposing that fine scale power co¬ 
efficients are projected onto coarse scale power projections. This section will re¬ 
assemble the matrix Hj from the lifting factorisation that resulted from Proposi¬ 
tion [2] In the first instance, the argument runs again from fine to coarse scale. Let 
fj + i(x) = $>j + i(x)sj+i, then this function can be projected onto the basis 4>j(x). 
Even if power coefficients are projected onto power coefficients, the projection is 
not unique, orthogonal projection being just one of the possibilities. The reassem¬ 
bly of the refinement matrix Hj from its factorisation, discussed in this section, 
induces one particular projection. Section [3~4l will explain how to realise any other 
projection using one further lifting step. 

A projection onto is characterised by a complementary basis T. ; (.x), 

termed the wavelet basis, for which 


$j + i(x)aj + i = 3>j(x)sj + ^>j(x)dj. (38) 

The expression (1381) is a basis transformation that can be interpreted in two direc¬ 
tions. From left to right, it represents the projection, where the actual calculation 
of Sj and dj still has to be developed. In the other direction it describes the recon¬ 
struction of the fine scale data from the coarse projection Qj(x)s,j and the residual 
^ j(x)dj. The reconstruction includes the refinement given in the equation (fT71) . 
Indeed, take for the vectors s 3 and dj a column of the matrices I n . and O nj , the 
identity and zero matrices of size rij x n 3 . Then the the vector .s ?+ 1 must be the 
corresponding column of Hj. In a similar way one can take for Sj a zero column 
and for dj a canonical vector. The vector s J+ i is then the column of the matrix Gj 
in the wavelet equation 

^j(x) = $ j+ i(x)Gj. (39) 

Substitution of the two-scale and wavelet equations (fT7T) and (l39l) into (l38l) amounts 
to 

= HjSj T Gjdj. (40) 

The projection of fj + \{x) is found from the inverse of (l40l) . This inverse can be 
represented by the matrices 

Sj = Hj s j+ 1 , (41) 

dj = Gjsj+i. (42) 


17 


The matrices Hj and Gj can be found from the inversion of (l40l) . which is formu¬ 
lated as the perfect reconstruction property 

~ tT + GjGj = In 


HiHj 


n 3+ 1 ' 


(43) 


The residual coefficients dj are known as detail or wavelet coefficients at scale 
j. The coarse scaling coefficients Sj can further be processed into detail and scaling 
coefficients at scale j — 1 and so on. The multiscale transform from coefficients 
sj at finest scale J into details at successive scales and scaling coefficients at a 
final, coarse scale sl is the forward wavelet transform or wavelet analysis. It is 
carried out by repeated application of (l4lT) and (1421) . Likewise, (l40l) is one step in 
the inverse wavelet transform or wavelet synthesis. 

The forward transform matrix is denoted as W. It maps the fine scaling vector 
sj onto the vector of coarse scaling coefficients and multiscale details. Denot¬ 


ing the latter vector as wl in the definition w[ = 
forward wavelet transform is formalized as 


dl 


d T j -i 


the 


wl = Wsj. (44) 

The inverse wavelet transform matrix is denoted by W = W^ 1 . 

The following theorem states that the lifting factorisation of a refinement ma¬ 
trix Hj can be used to find in a straightforward way a matrix G^, so that Hj and 
G^ constitute one step of an inverse wavelet transform. Moreover, the forward 
transform matrices II-' 1 and Gj follow immediately as well, i.e., without any ex¬ 
plicit non-diagonal matrix inversion. The superscripts in G^ and tlj' 1 refer to 
the fact that these matrices follow in a natural way from the lifting factorisation 
of Hj. Other matrices Gj and Hj may appear in a perfect reconstruction scheme 
(1431) with Hj as well. Theorem [8] will provide a lifting construction for all possible 
Gj and Hj, given a refinement matrix Hj. This construction does not affect Gj, 
which is the reason for not writing a superscript in that matrix. 

r s | 

Theorem 7 ( wavelet transform from lifting factorisation) If we can write Hj e = 
Hj S +^—Uj S+1 ^Hj S 0 as in 1221) . with H^ 1 ^ = Dj an invertible matrix, then we can 
construct a wavelet transform containing Hj as refinement matrix. In particular, 


we let Pj b+ ^ = H^ 0 Dj 1 , and take as forward transform 

Sj = D j l (Sjj r i te + Uf +1] Sj + i j0 ), 

(45) 

dj = s j+i,o~Pj ^ Dj■ 

(46) 

The synthesis or inverse transform consists of 

s j+\,o — dj + Pj ^ Dj Sj , 

(47) 

T~\ T-7-[S-|-l] 

&j-\-l,e — Uj Sj Uj Sj-\-l, 0 ' 

(48) 
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Is] 

Proof. From the inverse transform, we can check that H- ' 

Sj+i, e = DjSj—Uj S+1 \dj+Pj S+1 ^DjSj), meaning that H^ e 


^-[s+l] _ £/-[ s +l]^[ s ] 

The factorisation behind the transform is thus 


_ p\^ v D 3 , while 

_ n _r7-[ s +l] p[ s +l] pi _ 

- u 3 Uj r-j u 3 - 


’ H [s] ' 

3,e 

II 

-1 

1 

Elf 

_1 

1- 


T —TP 

ln i U 3 


s+1] 


Inj 0 

| 3 + 1 ] 


P 


in ' 


1- 

1 

b 

-1 

-1 

O 

_i 


where n' = rij + \ — n 3 . 

In principle, Theorem |7] is applicable to any factorisation (1221) . In practice, it 
becomes interesting as soon as Dj is a diagonal matrix. Then the band structure 
for both forward and inverse transform is under immediate control, as inverting 
the transform requires no matrix inversion, except for the trivial case of a diagonal 
matrix. 

Let Hj be a general refinement matrix, so that H ie has a larger bandwidth than 
Hj.o, then a full factorisation is given by 


Hj = 


Hj,e 

H j>0 







(49) 


where u is the number of update steps. The refinement matrix can be expanded 
into a full, invertible two-scale transform by adding independent columns to the 


last factor, thus defining a detail matrix G - = 

factorisation 




^3,0 


in the following 




H, Of 1 

II 

1 - 


H iP G 


L 3,e 
Hj o 


G 


j,° 


n 

Vs—1 


t — 

ln 3 U J 

0 


P f 


0 

In 


Dj 
0 

(50) 

This is an inverse transform, i.e., the synthesis of fine scale coefficients. The corre¬ 
sponding forward transform or analysis is denoted with the matrices H:^ and G :j 
and its factorisation follows immediately from the inversion of (l50l) . i.e., 


0 

In 


I U [q ~ 

±n j U 3 

0 I n '. 

(51) 

For completeness, if Hj t0 has a larger bandwidth than Uj. e , then the full fac¬ 
torisation of the refinement matrix is 


5 

gT 


[0]T 


Hj G^ 


i -i 


DJ 1 


In I 


(u— 1 

n 

Vs=0 


- p 3 


[ 9 -«] 


Dj 
0 

(52) 

The other matrices in the perfect reconstruction scheme (1431) follow from this fac¬ 
torisation in a similar way as in (l50l) and (1511) . 


Hj = 


H 

H 




3,o 


In, 0 
pM I , 

3 n j 


n 

Vs=l L 


In, -m 


p 1 I / 

3 Hr 
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In any case, the last step in the factorisation must be a prediction step. Indeed, 
as follows from Theorem [7] and its interpretation, the factorisation may stop as 

•[«+!] 

*i,e 


soon as Hff H is a diagonal matrix. A diagonal matrix is not required for the last 


H lo 


s+1] 


Theorem [7] also implies to the following corollary (Daubechies and Sweldens, 


1998D . 


Corollary 2 A matrix Hj can only qualify for a wavelet transform if all columns 
in H ]f and Hj 0 are pairwise coprime. 

Indeed, if the evens on a column form a vector which is a multiple of the odds, 
then Dj will contain a zero diagonal element in that column, making it a singular 
matrix. 


3.4 Fast B-spline wavelet transforms 


So far, we have concentrated on finding the matrix Hj in the two-scale equation 
([ITT) for a B-spline scaling basis ‘hj. The actual goal is, however, the design of 
a wavelet transform associated to the B-spline basis. In particular, we want to 
choose an appropriate wavelet basis 'hj in (l38l) . From the wavelet equation (l39l ) it 
is clear that properties of \kj (x) can be realised through an appropriate design of 
Gj. The design of lifting steps (l50l) in Section [3731 has produced a matrix G^ as a 
side effect, but it is unlikely that this matrix realises the exact properties we have 
in mind. In particular, all elements in —Uj and P- are positive or zero, leading 


to the conclusion that Gj contains only non-negative entries. The functions in 

■ i 

. , being linear combinations of non-negative functions with 


*?>) = *;+t 


a non-negative coefficients, cannot possibly have zero integrals. Therefore, in a 
strict sense, these detail basi s functions cannot be termed wav elets. 

The following theorem (Daubechies and Sweldens, 1998) states that all possi¬ 
ble matrices Gj can found one from the other by a single update lifting step. 


Theorem 8 (final update step) Let Sj+i = HjSj+Gjdj be a fine scale reconstruc¬ 
tion with two-scale matrix Hj and detail matrix Gj, and let Sj+i = F/jsj^ + G^ d 
be an alternative scheme involving the same two-scale matrix and the same detail 
coefficients. Both pairs (H j, G j) and (H j, G^) belong to a perfect reconstruction 
( 14 j I ) quadruple of matrices. Then there exists un update operation Uj so that 


Gj = Gf - H,Uj. 


(53) 


As a consequence, the updated scaling coefficients Sj can be found from s 1 - 


dj: 


f = sf 1 + Vjij. 


and 

(54) 
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Proof. The proof is straightforward by the construction 


Uj = Hi Gv 


( 55 ) 


Perfect reconstruction and Definition (l55l) . amount to 


Hj 

G t 


3 J 


H, Gf 


I 


3+ l,e 


Uj 


0 


I 


3 + l,o 



r m 1 

r 


, and to 

[gj J 

Hj Gj 

— 


These two expressions can be combined into 

I i 4-1 p U ■ 


G J 


H i G 3 


1 j+ l,e ^ 3 
0 -(i+l,o 


/ 


J+l>e 


0 



r ht 1 




n n 


[gJ \ 


^ Hj G, ' 

— 

1 

O c 

O < 

1_ 


j+l,o 


which includes (l53l) . □ 

In the factored wavelet transforms of (l50l) and (I5T1) the operation Uj occurs at 
the end of the forward transform. Formally, this adds 


1_ 

Gj, e ' 


1 

QD 

1 _ 

1 - 

1 

1_ 

Hj,o 

j? 

O 


— 1 

CD 

O 

_1 

-1 

0 

_1 


to (f50l) and, obviously, 


’ Hj.e 


1 

■t-) 

>-df 

I_ 

1- 

S’ 

CD 

1_ 

l- 

CD ’ 

_ Hj, 0 

O 

■<>> 


1 

O 

In' 

0 J 

O 

&T 

_ 1 

& 

1 _ 


(56) 


(57) 


to (15Tb 

Unlike the lifting steps in Section [33l the matrix Uj can have more than two 
non-zeros in each of its columns. 

Given a matrix cj , for instance from the factoring in Sections 13.31 and 13.11 
the design of the final wavelet matrix Gj proceeds through the design of the final 
update Uj. A typical design property is to impose vanishing moments, that is 
f™ 00 il>j,k(x)x m = 0 for all j and k and for m = 0.... . p. Define the moment 
vectors of I'/ix) and T. ; (.x) as 


Mj 


O 


3, m 



<f 'J(x)x m dx, 
T'J (x)x m dx, 


(58) 

(59) 


andletM,| p) = 


Mj -0 Mj. 1 ... M J;p _ 1 , and similarly for 0^\ As = 


< hj_i_i(x)Gj° , the preliminary moments can be computed throughout the lifting 
scheme culminating into the expression 0' ,;l ° = Similarly we find 

mJ p) = HJmJ^ v As the final update defines the basis function T'j(x) = <&j + \(x)Gj = 
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(x) — <&j(x)Uj, we have Oj = — UjMj P \ Imposing that Oj = [0] 

amounts to the equation 

of m = UJm^. (60) 

For a wavelet in the strict sense of the definition, we need to impose the van¬ 
ishing moment condition for p = 1. Higher vanishing moments are often not so 
useful in statistical applications, and if they turn out to be beneficial, then often 
it is a better idea to optimise for the wanted benefits in a more explicit way. In 
particular', for use in statistics, it is better to explicitly impose that the transform is 
as close as possible to being orthogonal. All this is further developed in Section [5J 


3.5 The non-decimated B-spline wavelet transform 


The non-decimated wavelet transform is a redundant data decomposition that has n 
wavelet coefficients at each scale, where n is the size of the input vector. Moreover, 
if the elements of the input vector is shifted, then the coefficients at each scale are 
shifted in the same way. This is the translation invariance property. Translation 
invariance is impossible in a decimated transform. Indeed, the decimation takes 
place in the the even-odd partitioning. As evens and odds play different roles in the 
subsequent lifting steps, a shift in the input vector leads to a different role for each 
input element, and thus a different outcome. The fact that shifted inputs lead to 
outcomes with different values may, for obvious reasons, complicate the analysis 
or processing of the wavelet coefficients. 

Each step in the non-decimated transform starts from n scaling coefficients at 
scale j + 1 and produces n scaling coefficients at scale j together with n wavelet 
coefficients. At the finest scale, i.e., for j = J — 1, the decimated scaling coef¬ 
ficients fill up \n/ 2] values of the non-decimated expansion, while [n/ 2j values 
of the non-decimated wavelet coefficients at scale J — 1 come from the decimated 
transform. In order to complete the other half of the coefficients, the same trans¬ 
form is earned out switching the roles of evens and odds. This can be realised 
by defining the shifted knots x^j\ = while we set xf' h = x.jj. = Xk for 

the original vector of knots. After the first step, the shifted vector of knots has 
generated an alternative decimated set of knots, which contains the evens of the 
shifted vector, that is, the odds of the original vector. We denote the new vec¬ 
tor at scale J — 1 by XjL 1 k = Xjl 2k = xj 2k+ \ = ^ 2 fc+i> while the original 


decimated vector is now denoted as Xj _, k = Xj 2k = xj t 2 k = x 2 k- Both vec¬ 
tors can be shifted for use in the second step, thereby defining two more vectors 


Jt] 


^J-l,k ~ X J-l,k-l ~ x J,2fc-2 allu ^J-l,k ~ -V-l,fc-l “ 

vectors are used in the second step, leading us to scale J — 2. 


= X; 


and 


_ [3] _ [2] 


= X 


X J2k- 2' AH fo LI I 
In general, the non- 


decimated transform at scale j consists of 2 J 3 decimated transforms, each defined 
by a vector of knots x [ * k +b] = x [ f +l)2k _ 2b = x {k _ b)2 j- j+rem{at2 j- j+ iy 
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4 The experimental approximation error of a spline wavelet 
decomposition 

The following subsections discuss the accuracy of approximation in a B-spline 
wavelet basis on irregular knots. The importance of the approximation error in 
statistical applications lies in the fact that approximation error is a source of esti¬ 
mation bias. From the discussion below, it turns out that the common practice in 
wavelet based noise reduction to take observations as tine scaling coefficients can 
not be adopted when the observations are not equidistant. 


4.1 Linear approximation error in a homogeneous dyadic refinement 

Let f(x) be a smooth function on [0,1], i.e., a function with at least p continuous 
and bounded derivatives, or a function which is uniformly Lipschitz-a continuous, 
with a > p, as defined in ( MallatS. 2001 . Definition 6.1). This function can be ap¬ 
proximated by a linear combination fj(x) of the B-spline scaling functions at level 
J defined on the equidistant knots xj^ = k2~~ J , leading to an expansion as in ©, 
substituting j by J. As in Section 13.31 index J refers to the finest scale, this is the 
scale at which the observations take place. It can be proven that both the pointwise 
approximation error f(x) — fj(x) (ISweldens and Piessensl.ll994 , (3.5) and (3.9)) 
and the mean squared error ||/(x) — fj(x )|| (Strang and Nguyen, 1996, Theorem 
7.5, page 230) are of the order O (2~ JpS j . The approximation fj(x) is not unique. 
It can, for instance, be defined in accordance to a subsequent wavelet decomposi¬ 
tion that is applied to the approximation. In principle, the wavelet transform of the 
approximation does not interact with the construction of the approximation at finest 
scale. Nevertheless, each wavelet system tpj ^x) fixes a dual scaling basis <pjk{x) 

which can be used to define a projection fj(x) = Sfclo 1 {f( x )> VJ,k( x )} VJ,k{x). 
Alternatively, if tpj ; k(x) does not form an orthogonal basis, then fj(x) can also 
be the orthogonal projection onto the basis <pj t k(x ). Suggesting yet another possi¬ 
bility, similar approximation results are also av ailable for interpolating splines on 
regular point sets ( Dubeau and J. Savoie . 1995b . In all these cases, we conclude 


that a resolution of 2~ J is enough to obtain an accuracy of 2~ Jp , thanks to the 
smoothness of the function f(x). To the best of my knowledge, nearly no such 
results exist for irregular knots. One of the difficulties is that little is known about 
the dual functions ^j±{x). On regular knots, all these functions are translations 
and dilation of a single dual father function, allowing us in a fairly easy way to 
establish a general upper boundfor the approximation error. 

The accuracy of order 2~ Jp for resolution 2 ~ J does not hold for the approxi¬ 
mation obtained by taking function values as scaling coefficients, i.e., 


2 J —1 


fj( x ) = f( x J,k)VJ,k( x )- 


(61) 


k=0 


Instead, it can be proven (ISweldens and Piessensl Il994l. Theorem 2.4) that fj(x 
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is within an L 2 -distance 0(2 Jp ) from the blurred function 


2 J — 1 


fj(x) = H VJfl{xj,k)f{x-xj tk ), 
k=0 


( 62 ) 


whose approximation error, it its turn, is of the order || f j(x) — /(x) lb = ® (2 




at least if the ipj tk (x) are normalised so that J2l=o VJ,k{x) = 1. The blurring 
effect thus neutralises all benefits from the p vanishing moments for linear ap¬ 
proximation of smooth func tion by a refinable basis. This phenomenon is known 
as the wavelet crime ( Strang and Nguven . 199611 . If, however, the scaling func¬ 
tions are interpolating in the sense that ipo,o{k) = 0 unless k = 0, then the 
approximation accuracy is restored. This is the c ase for the wavelets that fol¬ 
low f r om the Deslauries-Dub u c refinement scheme dDeslauriers and Dubuc , 1987 , 
19891: iDonoho and Yul 1 19991 : ISweldens and Schroderl. Il996t) . and the advantage 
is preserved if the Deslauries-Dubuc refinement takes place on a non-equidistant 
point set. Although Deslauries-Dubuc refinement may suffer in other aspects from 
non-equidistance, the immediate use of function values at the input is an impor¬ 
tant benefit for this scheme. As an alternative for interpolating scaling functions, 
one can impose that the projection coefficients are close to the function values, 
i.e., (f(x),(pj tk (x)) = f(xj tk ) + O (^2~ Jp ^ . This is realised by imposing that 
(. x q , 1 pj, k (x)) = 0 for q E {1,..., p— 1}. If the basis is orthogonal, then cp jkix) = 


ipj.k ( x), and the development of the condition leads to the class of coiflets (Daubechies, 
19931) . 


4.2 Nonlinear approximation, compression and thresholding 

When f(x) contains jumps, cusps, or other singularities, any approximation as 
in ([7]) may have a local error of order 0( 1) near the singularities. More pre¬ 
cisely, if the interval [xj tk ,xj tk + 1 ] contains a singularity, then for any point x € 
[xj )k ,xj yk+ 1 ], the pointwise approximation error is | fj(x) — f(x)\ = 0(1). As 
the location of the singularity is only known up to the resolution of the observa¬ 
tion, the local error contributes to the total L 2 -error at a rate equal to the resolution 
of the observation, no matter how accurately the smooth parts in between are ap¬ 
proximated. Therefore, when f(x) is piecewise smooth, then the resolution of 
observations, J, is often taken much finer than necessary for the application. Next, 
a wavelet decomposition is applied and all fine scale detail coefficients up to a level 
L are omitted, except for those that correspond to the singularities (typically the 
large ones). Taking J > Lp ensures that the error in catching the singularities does 
not exceed the error from the smooth part approximation. 

In this nonlinear thresholding scheme, the error from using function values as 
scaling coefficients at scale T J has to be compared to the smooth approximation 
error of 2~ Lp . Setting J > Lp thus also ensures that function values as finest scal¬ 
ing coefficients poses no problem, at least if the thresholding in between causes no 
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additional error. In particular, if f(x) is a polynomial, then all wavelet coefficients 
of a proper approximation fj(x) are zero. In that case, the detail coefficients of 
f j(x) are also zero, because of the following result (Mallat, 2001, Theorem 7.4, 
P-241). 


Theorem 9 Let ■■fix) be a father scaling function. Then for ail q E {0,1,... , p — 
1 }, the function 

vi p ' q \x) = ^2 k q cp(x — k), 

kdTL 

is a polynomial of degree q if and only if for all q E {0,1,..., p — 1}, there exist a 
sequence of coefficients xso that 


x q = 22 x ^k ’ q ^T( x ~ k). 
fees 


This result has the following interpretation: if the basis consists of translations 
of a single father function along equispaced knots, and if the basis reproduces 
polynomials up to degree p — 1 , then the coefficients for the decomposition of 
that polynomial can be found as the evaluation of another polynomial in the knots. 
Conversely, if the function values of any polynomial used as coefficients lead to a 
polynomial, then all polynomials can be reproduced within this basis. 

Replacing the scaling coefficients of a polynomial by the function values thus 
defines a new polynomial, whose detail coefficients are also zero, at least if the 
scaling functions have p vanishing moments to reproduce polynomials up to degree 
p— 1. Thresholding up to level L introduces then no additional error. This motivates 
the p vanishing moments in compression and denoising, even if at the finest scale, 
they are not exploited when function values are plugged in as scaling coefficients. 
As a conclusion, on an equidistant set of knots, and in a nonlinear wavelet method, 
the wavelet crime can be forgiven. 

4.3 Nonlinear approximation on non-equidistant knots 

In an approximation using non-interpolating scaling functions, such as B-splines, 
on a non-equidistant set of knots, the wavelet crime is unforgivable. It has two 
unpleasant effects that do not occur in the equispaced case. First, the approxi¬ 
mation eiTor may propagate towards the coarser levels. Indeed, Theorem [9] is not 
applicable. From (fT2l) . it can be seen that the coefficients representing the identity 
function on the grid of knots cannot be retrieved as the evaluation of a polyno¬ 
mial in the knots. Conversely, when we take the knot values x.jj. as scaling coeffi¬ 
cients, they will not be recognised as coming from a smooth function. We can write 

xj k = x j^ + c./j,, where the error term sdepends on the configuration of the 
knots. Thresholding the detail coefficients that follow from such errors £jj- may 
have an error reducing effect. Depending on the positions of the knots, it may also 
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lead to an increase of the approximation error up to the order of O(Al), thereby 
undoing all benefits from <pj f.(x) having p vanishing moments. The second un¬ 
pleasant effect of taking function values as scaling coefficients is a visual one: it 
leads to a decrease in smoothness of the reconstruction at the fine level. Indeed, the 
approximation fj(x), reflects the irregular spacing of the knots xj±, in a way sim¬ 
ilar to a decomposition and reconstruction that would ignore the non-equidistant 
locations, assuming that xji- = k/2 J . 

Therefore, before carrying out the actual wavelet analysis, we need to find a 

FI 

vector of scaling coefficients sj for which f(x) = <bj (x)sj. As we are given the 
function values ./'(I'j.fe) for k = 0,..., n — 1, we have to solve the set of equations 

r~i 

f(xj,k) = Tii=o sj,i<PjJ(xj t k). In matrix notation, this is fj = &jsj, where 
fj = [..., f(xjj : )....] is the vector of observations and where the matrix <& / has 

F 

entries = A,/ / ( x ././.-)■ For the sake of readability, the superscripts are omit- 

F 

ted in the notation of the matrix. The evaluations are earned out using 

the recursion Q in Definition |2] All B-splines are zero in the first and last knots, 

F 

i.e., ipj\(xj t k ) = 0 for k G {0, n — 1}, and for alH € {0,..., n — 1}. So, the first 
and last row in <1>j contain zeros only, and no vector sj can reconstruct the values 
f{xj,o) and /(xj iTl -i). In order to be able to reconstruct /(xj, 0 ) and f{xj, n - i), 
we add two artificial knots left and right from the interval [xjp, xj. n - \ ]. By adding 
the two columns for the additional corresponding B-splines, but not the zero rows 
for these new end knots, we get an n x (n + 2) matrix T> / and a vector sj of 
length (n + 2). The system j = fj being indeterminate system, we look for a 
solution that behaves well if fj is observed with noise. For reasons of superposi¬ 
tion, the transformation s j = Sjfj should be linear. There is no need for a noise 
reducing effect in the transformation, since we want to keep the noise reduction for 
subsequent wavelet analysis, which is much better equipped for the task, especially 
when f j has singularities. On the other hand, independent, homoscedastic noise 
should stay more or less homoscedastic. Therefore, the matrix Sj should be as 
close as possible to the identity matrix. In particular, its singular values should be 
close to 1. This would motivate to take Sj = f I> j (' ■ Simulation studies 
show, however, that even in this optimal case, the singular values are much too 
large for use in practice. Moreover, the outcome sj would be an exact solution 
of the indeterminate system, but not necessarily the solution that the subsequent 
wavelet analysis expects. In particular, suppose that fj comes from a polynomial 
of degree less that p, then the wavelet detail coefficients should be zero. This hap¬ 
pens only if the operation Sj would deliver the right power coefficients, as given in 
Theorem [3] There is no guarantee that this would happen. For these two reasons, 
no exact solution of the system &jsj = fj is satisfactory for use in practice. 

An approximative solution can be found using some sort of regularisation. 
Tikhonov regularisation would minimise /j|| 2 +a||sj|||, for some appro¬ 

priate value of a. As the objective is to control but not to reduce the noise variance, 
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the value of a would be smaller than in ridge regression. While the regularisa- 
tion controls the variance effectively, there is no control at all on the propagation 
of approximation error, i.e., the bias, through the subsequent wavelet decompo¬ 
sition. The same remark holds for other general regularisation methods, suc h as 
the Landweber iteration schem e for ill posed linear inverse problems (ILandweberl. 


1951; Daubechies et ah, 2 00 4). 


A good compromise for Sj should not be too restrictive, i.e., it should not im¬ 
pose the perfect reconstruction = f j for any vector fj. Instead, we focus 

on the power functions. Let X® denote the matrix with entries = x q Jk 

for q = 0,... ,p — 1 and let X^" the matrix with the corresponding coefficients in 

the fine scale B-spline basis, i.e., {X^)k A = Xj’{^■ Then imposing 


SjX® = x®, 


(63) 


ensures that all polynomials of degree less than p are represented exactly and with 
the coefficients that are recognised by the subsequent wavelet transform as coming 
from a polynomial. Expression (l63l) formulates p conditions for each row k of 
Sj. Let dk denote the set of indices l for which we choose Sj-i-j / 0, then 
taking #dk > p + 1, allows us to satisfy these constraints, leaving one or a few 
free parameters to control the variance of the scaling coefficients. In particular - , an 
objective can be to take Sj as close as possible to the identical transform, extended 
with two zero columns for the two additional knots at the end points of the interval. 


n j 


0 , 


We minimise the Frobenius norm \\Sj — /./ 1 | f, where Ij = 0 r 
the level of the fcth row, this amounts to the constraint optimisation problem 


.On 


subject to 


for q = 0,... ,p — 1. 


min ||5j;fc i( 3fc ^fe+i,9fe||2> 

^ J\k,dk 

e =sfr 1 , 

l£dk 


(64) 


(65) 


5 Estimation in a B-spline wavelet basis 


The spline wavelets proposed bv lcohen. Daubechies. and Feauveau (1992) devote 
all free parameters in Gj to vanishing moments p. This corresponds to a lifting 
scheme where the final update step is found by the system in (l60l) . Imposing one 
primal vanishing moment, i.e., (l60l) with p = 1, is necessary to have wavelets in 
the strict sense of the word, basis functions that fluctuate around zero so that their 
integral is zero. Basis functions with zero integral are indispensable for the rep¬ 
resentation of square-integrable functions. Indeed, when the basis functions have 
nonzero integrals, there exist nontrivial approximations of the zero function that 
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converge in quadratic norm (Jansen and Oonincx, 2005, page 93). Basis functions 
with nonzero integrals are useful on subspaces of the square-integrable functions, 
defined by additional smoothness conditions. These conditions exclude functions 
with jumps, which are typically the functions of interest in a wavelet analysis. 

On the other hand, the experiment in Figure [Q illustrates that using all free 
parameters for a maximum number of primal vanishing moments may not be the 
best choice in a context of function estimation. This holds in particular on non- 
equispaced data, irrespective of the wavelet family. For spline wavelets, it holds 
also for data observed in equidistant points. The experiment suggests that specific 
design criteria are necessary for wavelets in statistical applications. 

The experiment starts from nj = 1000 fine scaling coefficients that are inde¬ 
pendently and identically distributed random variables with zero mean ej. Using 
W from (1441) . define the transformed random vector r// = Wej. The wavelet co¬ 
efficient vector r)L consists of coarse scaling coefficients plus detail coefficients at 


each scale, r= 


,T 


Si 


Si- 


Let Dl be the linear diagonal selec- 


~L U L ■ ■ ■ U J-1 

tion operation that keeps the coarse scaling coefficients and discards all detail coef- 

T 


ficients, i.e., Dltjl = 


Ol 


Oj-i, 


and consider the reconstruction 


WD L r, J L = ( WD l W)ej = ( WD l D l W)sj = (W L W L )ej. Here W L = D L W 
and Wl = WDl- Note that 


J—L 


W L = n Hj-i = -Hj-i-Hj-2 ... H l+ 1 H l , 


( 66 ) 


i= 1 


while 

j -1 

W L = I] Hj. (67) 

j=L 

Since all its steps are linear operations, the reconstruction (Wl Wl)£j is unbi¬ 
ased. All observed errors are due to the variance of the reconstruction. Each plot in 
Figure Q] compares anon-orthogonal projection P l = WlWl with the orthogonal 
Pl± = Wl (Wj Wl )~ 1 Wj . The projection matrices P l should not be confused 
with the prediction matrices in Sections 123113.31 and l3.ll The non-orthogonal 
projection is constructed level by level, where in each level the lifting factorisa¬ 
tion of the cubic B-spline basis is completed with a final update matrix Uj that 
has one nonzero off-diagonal next to a nonzero diagonal. All nonzero elements in 
f7j are filled in by imposing two primal vanishing moments, i.e., () :) , m = 0 j for 
intermediate scales and for m S {0,1}, where 0 J/rn is defined by (l59l ). 

The reconstruction from the Deslauries-Dubuc projection with two vanish¬ 
ing moments shows a large variance on the non-equidistant data, but not on the 
equidistant equivalent. Reconstruction from a projection with two vanishing mo¬ 
ments onto a B-spline basis shows a large variance in both the equidistant and non- 
equidistant cases. Moreover, further experiments reveal that the variance increases 
when the B-spline wavelet transform involves more scales. In the Deslauries- 
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Multiscale B-Splines, equidistant 


Deslauries-Dubuc. equidistant 






Figure 1: Noise reduction using linear diagonal selection in a multiscale cubic B- 
spline basis and in a Deslauries-Dubuc scheme with multiscale cubic interpolation: 
reconstruction with all detail coefficients replaced by zero. Equidistant and non- 
equidistant observations. In grey line the observations, in black solid line the re¬ 
constructions from projections with two primal vanishing moments, in thick black 
dashed line the reconstructions from the orthogonal projections onto the B-spline 
or Deslauries-Dubuc bases. 

Dubuc scheme, there appears to be no such multiscale deterioration. Large vari¬ 
ances in that scheme are rather isolated, although possibly devastating, effects from 
local irregularities in the non-equidistant grid of observations. 

Although the large variances are clearly an effect of the non-orthogonality of 
the transform, the classical numerical condition number sheds no light on the prob¬ 
lem. Indeed, on an equispaced, dyadic grid of nj = 2048 knots, the condition 
number of the forward Cohen-Daubechies-Feauveau cubic B-spline wavelet with 
two primal vanishing moments turned out to be 71.9, and this number is fairly in¬ 
dependent from nj. On the other hand the condition number of a cubic Deslauries- 
Dubuc scheme with also two primal vanishing moments is dependent on nj and 
for nj = 2048 it equals 203.7. Nevertheless, the Deslauries-Dubuc scheme on 
equispaced data shows no problems with large variances. The numerical condi- 
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tion of the wavelet transform is therefore not an adequate description of the non¬ 
orthogonality of the transform for statistical applications. The same conclusion 
holds for the Riesz constants Mallat (2001) in a biorthogonal transformation. 

As in B-spline wavelet transforms the increasing variances across scales oc¬ 
cur also on equidistant data sets, it is possible to describe them using Fourier 
transforms. Given a vector after projection el = P r.£, define its Fourier trans¬ 


form Y(u) = \ £"=i £l,-, 




By carefully checking the effect of sub- and 


up-sampling on the Fourier transform, it is fairly straightforward to prove that for 
e uncorrelated and homoscedastic, it holds that 


E\Y(u)\ 2 


1 

2 


J-l 


n 

j=L 


J ~ L -lJ-l 

e n 


k =0 j=L 


H(2\u + kit/2 J ~ L ~ l 



( 68 ) 


In this expression H(ui) represents the Fourier transform of one row of the one step 
matrix Hj . In the equidistant settings, all Hj coincide with a single Toeplitz ma¬ 
trix, for which one row suffices to characterise the complete multiscale transform. 
Obviously, the same definition holds for H(u). 

The Fourier analysis cannot be applied to non-equidistant observations, and so 
it cannot be used to find the best Hj for a given sequence of Hj. Still under the 
assumption that the covariance matrix of e is T, e = a 2 1, we find that the covariance 
matrix after projection equals = ct 2 PlPl- Since Pl?l± = Pli. and 

Pl_l is symmetric and idempotent, it holds that P/ P[ - Pl±PL = Pl(/ - 
Pl_l)Pl- Th e matrix I — P/.i is positive semi-definite, so all diagonal elements of 
P^P^ — Pl±PIp mus t be non-negative, which reads as var(P^e) > var(P£j_£), 
this vector inequality holding componentwise. This conclusion is confirmed by 
the observation in Figure Q] As a result, the variance propagation of a wavelet 
decomposition applied to uncorrelated homoscedastic observations is described by 
the nonzero eigenvalues of P l P[, i.e., the nonzero singular values of P l . As a 
summary, define the multiscale variance propagation as follows 


Definition 4 (multiscale variance propagation) Given a wavelet transform defined 
by the sequences of forward and inverse refinement matrices Hj and Hj, where j = 
L ,..., J—1. Let Wl and Wl be defined as in ifbbi) and d67l) . and let P r = WlWl- 
Then the multiscale variance propagation equals 

kf(Wl,Wl) = |k(P L )|| 2 /V^Z= IIPlIIf/v/aL) (69) 


where <t(Pl) is the vector of singular values o/P l, while n r is the number of 
columns in Wl, which is also the number of nonzero singular values of P r,. The 
notation ||Pl||f stands for the Frobenius norm ofPi,, whereas ||cr(P^)|| 2 = 

cr(P l) t <t(P l) is the classical Euclidean vector norm. 

As an alternative for (l69l ). the multiscale variance propagation can also be defined 
as 

K 2 (W l , W l ) = max(<x(P L )) = ||P L || 2 , (70) 
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where ||Pz, ||2 denotes the matrix norm induced from the Euclidean vector norm, 
which is equal to the largest singular value of the matrix. 

Using the perfect reconstruction property that WlWl = I nL , it can be proven 
that all singular values of Pz = WlWl must be either zero or greater than one. 
As a consequence, it holds that kf(Wl, Wl) > 1, and kf(Wl, Wl) = 1 if and 
only if P l is an orthogonal projection. 

Given the refinement and detail matrices H ^ and G l that result from the fac¬ 
torisation in Section 1231 we want to design a sparse update matrix Ul that min¬ 
imises the Frobenius norm of P /, under the constraint that it also preserves a few, 
say p, primal vanishing moments. The index L refers here to the coarsest scale up 
to the current stage in the design, but L may turn out to be an intermediate scale in 
the eventual transform. 

We first fix which elements of Ul will be nonzero. The number of non-zeros 
must be large enough to cover the p vanishing moments and allow a further optimi¬ 
sation. For a given element UL-, r ,s this means that we either choose it to be zero, or 
we optimise its value. We therefore compute the derivative of the Frobenius norm 
with respect to Since WlWl = Wl{H^ T + UlGI)Wl+ 1 , we find that 


d{W L W L )ij 

dU L - r ,s 


d(W L ULGlW L+ i)i d 

dU L - r ,s 


WL;i,r{G T L W L+ l)s,y 


This we use in 

d\\W L WL\\ 2 F 

dUpr.s 


dYZ=iEU (WlW l )?j 

dU L -r,s 


i= 1 3 = 1 


d (W L W L )ij 

dU L ,r,s 


n n 

2 E E w W;riWrWiX 3 (GJWl+i ) s ,j = (wIw l Wl(G t l W l+ i) t ) r s 

i=lj=l 


In this expression, the matrix Wl = (//T UlG j ) Wj j + 1 depends on the update 
matrix Ul that we want to optimise. 

Most applications require at least one primal vanishing moment. Therefore, 
the moment equation (l60l ) is imposed as a constraint in the optimisation process. 
Using a vector of Fagrange multipliers for each moment, the objective function 
Kl{Ul , A L-m ) can be written as 


_ p -1 

K L (U L , \ L ,m) = \\W L Wl\\ 2 f + E X l,m(0 L -,m ~ U^M L -,m)- 

m =0 


Given a choice 1 L = {(r, s) € {1,.. .,n L } x {1,.. .,n L +i~ n L }\U L - r ,s / 0, the 
njj L = #Il equations for the optimisation follow from = 0, 


(Wl W L U l G t l Wl+iWI + 1 G t l ) 


~ _ p -1 

(wIw l H [ 1 ]t W l+1 wI +1 G T l) s +E A L;m;s M L;m;r 

r,S m= 0 

(71) 
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Figure 2: Singular values in descending order for projections WlWl onto a coarse 
scale B-spline basis, using several forward transforms Wl- In all examples, v = 4 
and p = 2. The orthogonal Wl has no band structure. All other Wj J are quadri- 
diagonal, except for the upper curve, marked with the A signs, which corresponds 
to a bidiagonal Wl- 


These equations are completed by (ul+i — til)p moment equations (l60l) . 

The set of non-zeros in Ul is fixed by the user before the optimisation is carried 
out. For obvious reasons, its cardinality nu L should be large enough to cope with 
the moment constraints, i.e., njj L > (ul +i — ni)p- The easiest option is to let Xl 
be the collection of v main and side diagonals. The diagonals have lengths equal 
to n L + i - n L , n L - 1 , n L + 1 -n L -l,n L - 2, n L + i - n L - 2, ..., summing up 
to n Uh = [v/2\ n L + \v/2 \ (n L + i - n L ) - L*V 2 J \v/2\. Since n L+ 1 - n L is 
the number of odds at scale j + 1, we have til +i — til € {til — 1, til}, and so 
nu L = v{n L +\-n L )-\yI2\ \u/2\+r L \y/2 \, where r L = 2n L -n L +i € {0,1}. 
It follows that njj L > (n-L+i — n-L,)p is not possible for v equal to p, unless p = 1. 
In other words, due to boundary effects, p vanishing moments require more than p 
diagonals in the final update. 

As an example, depicted in Figure |2j we compute the singular values of the 
projections WlWl using cubic B-splines, for the analysis of data that are de¬ 
fined on nj = 1000 fine scale knots. The fine scale knots were drawn indepen¬ 
dently from each other from a uniform distribution on [0,1], After J — L = 5 
resolution steps, the coarse scale has a resolution of til = 32. Therefore, the 
matrices Wl and Wj have 1000 rows and 32 columns, from where we know 
that nmk(WfWL) = til = 32. Figure [2] plots the first 53 singular values of 
the projections in descending order. It is no surprise that the 33rd and beyond 
are all zero. It is no surprise either that for the orthogonal projection, i.e., for 
Wl = (WZW L )- l WZ, all 32 non-zeros are equal to one. The orthogonal projec¬ 
tion matrix Wl is not sparse in the strict sense, although the off-diagonal elements 
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show rapid decay. Therefore, the fully orthogonal projection can be well approx¬ 
imated by a band-limited matrix, using the system of equations in (I7TT ). Figure [2] 
examines the quality of the approximations, showing that a bidiagonal matrix is 
probably too sparse: even if all free parameters are spent in the optimisation of the 
singular values, the largest value is close to 8. Three other alternatives for the or¬ 
thogonal projection use four diagonals. One option is to spend all free parameters 
on primal vanishing moments. The corresponding wavelets ipL,k(,%) a h have four 
vanishing moments, except those near the boundaries of the interval. The singular 
values are, however, not controlled, the maximum being close to 6 in this case. The 
situation may deteriorate quickly in other settings. When all free entries in Ul are 
spent on the minimisation of the singular values, the fourth order approximation 
is ready for use in statistical applications. But even when two vanishing moments 
are imposed in combination with a minimisation of the singular values, these val¬ 
ues are kept low enough for use in statistical applications. Other experiments, not 
shown here, confirm that imposing one or two vanishing moments in a scheme that 
otherwise concentrates on the singular values, is performing nearly as well in terms 
of singular values, as a scheme that has its entire focus on these singular values. 

As a summary for this section, wavelet transforms for use in statistical estima¬ 
tion should be as close as possible to being orthogonal, because reconstructions 
from non-orthogonal decompositions may suffer from variance blow up. Orthog¬ 
onality puts, however, serious limitations on the design of a wavelet transform. 
As an example, orthogonal spline wavelets with compact support do not exist. The 
numerical notions of condition number and Riesz constants in a biorthogonal trans¬ 
form are not sufficiently adequate for the description of the variance propagation. 
The design of the wavelet transform proposed in this section therefore focusses 
directly on the variance and it does so by looking at the combined effect of decom¬ 
position and reconstruction. 

6 Conclusion and outlook 

The first contribution of this paper has been to extend the construction of the 
Cohen-Daubechies-Feauveau B-spline wavelets towards the case of non-equidistant 
knots. The new construction is based on the factorisation of two-scale or refine¬ 
ment equations into lifting steps. An interesting topic for further research is to 
investigate the same methodology for other spline wavelets. 

The second contribution has been the design of a numerical method to find 
fine scaling coefficients from function values, in a way that performs well when 
the function is observed with noise. Future research could improve the method by 
making it adaptive to jumps or other singularities. In the neighbourhood of jumps, 
the benefit from a perfect reconstruction of power functions is limited, so a local 
relaxation of these conditions may yield sharper reconstructions. 

Finally, the third contribution has been a modification of the Cohen-Daubechies- 
Feauveau wavelet transform for specific use in statistical applications, making sure 
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to control the propagation of the variance on the wavelet coefficients at succes¬ 
sive scales. The focus on the variance propagation in the design of the transform 
allows us to relax the stringent orthogonality condition and to construct a basis 
of compactly supported spline wavelets in which non-linear processing leads to a 
reconstruction with a well controlled bias-variance balance. 
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Appendices 

A Available software - reproducible research 

All transforms presented in this article have been implemented in the latest version 
of ThreshLab, a Matlab®software package available for download from 
http://homepages.ulb.ac.be/~ma jansen/software/threshlab.html 
The forward and inverse B-spline wavelet transforms are earned out by the 
routines FWT_2Gspline . mand IWT_2Gspline . m. Several alternatives for the 
retrieval of appropriate tine scaling coefficients from noisy observations, explained 
in Section HJ are implemented in f inescaleBsplinecoef s .m. The use of 
this routine is illustrated in illustratef inescalecoef s . m. 

In particular, the experiment in FigureQ]is set up in the routine illustratevariancepro jection . m. 
The singular value plots in Figure [2]have been generated using illustrate_updateLSapprxprimmom. m. 

B Proofs for Theorems I2ll3l and |4] 

The proofs are given for the sake of self-containment. The results are well-known 
in literature. 

B.l Proof of Theorem [2] 

The summation in ([5]) contains rij — [p/2] — [p/2\ = rij — p B-splinc functions. 

All B-splines have mutually unequal supports, and are thus linearly independent. 

On the other hand, Ij consists of rij — 2p + 1 subintervals [xyj,, x :j j,.+ \), with p 
degrees of freedom on each of them. The continuous derivatives in each interior 
knot consume (rij — 2p) x (p — 1) of these degrees of freedom, leaving us with a 
vector space of dimension rij — p. □ 
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B.2 Proof of Theorem |3] 

2. Expression CD follows from ©, applied for fj(x) = x q . The left hand side 
of © can then be written as 


q — 1 ~[p— 1)9—l] [p—l] / \ 

qx q =q^x' j k _ 7 , Vj- 


fcGS 


whereas the right hand side becomes 


qx q 1 = 0 - i) E 


-[p,q] ~[p.?] 




'j,k -1 P-11 / x 

-- 


fees x i,fc+ [p/ 2] -i [p/2j 

Identification of the terms in the decomposition leads to (fill) . 

3. Next, it can be verified that all solutions for xf’^ in CD must satisfy 


_ 

X j,k ~ 


p — 1 


[P/21-1 

E 

*=i- Lp/ 2 j 




p — 1 


M- 1 ril 

/ , x h% -l- 0 • 

i=l-|//2j 


On the other hand, x 


' - must be independent from the p — 1 knots Xj^ 


j,k 


around Xj t o if k > p. So take x hl symmetric around Xj t o = 0. Then, 
obviously, 

1 r ?/ 2 !- 1 

E^ ^ = °- 


p 


i=l-[p/2j 


On the other hand, Sj 0 = 0, as the corresponding basis function is even. 
This leads to (fT2l> . 

4. The proof for (IT3l) is similar to that for (fl2l) . following an induction argument 
on p. 


□ 


B.3 Proof of Theorem |4] 

Consider x € (xj t k, Xj 7 k+ 1 ) C Ij, then x € Sj^ m l m < k + 1 and r m > k 
k — [p/2] + 1 < m < k + [p/2j . Then in x, it holds that 


1 x x 


r*P 1 


[?] / x [?] / x 

^-P/21 + lW ■■■ ^+L?/2j (l) J 


X, 
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where X m _ k+ \p/‘ 2 \ q +1 = x jm • ^ can be verified that X is non-singular, so we 
find an expression for the basis functions on the subinterval (xj^, %j,k+ 1 )> 

I?] / N [p] / \ 

Ki-M+iW ••• 

all other basis functions being zero on this subinterval. It follows that on each 
subinterval, the basis functions must be polynomials, and the polynomials are 
uniquely defined by (l72l) . As from Theorem [3j we know that the power func¬ 
tions have the coefficients xf’^} in a B-spline basis, the polynomials defined by 
(1721) must coincide with the B-splines on that interval. □ 


1 x x 2 ...x p 1 


X 


-i 


(72) 


C Constructive proof for the factorisation in Theorem [6] 


The factorisation is based on the approach in Daubechies and Sweldensl ( 1998 ). 
Because of the non-equidistant knots, it proceeds column by column in the refine¬ 
ment matrix. The proof also makes use of the band structure of a matrix. For the 
sake of simplicity, and without any impact on the applicability of the argument, we 

FI [si 

ignore occasional zeros within the nonzero band of H - ' e or H - Q . In each column 
of these matrices, we take the same number of rows into consideration, equal to 
the bandwidth of the whole matrix, even if that particular row has actually less 
non-zeros. 

FI FI 

Consider first the case where H - e has a larger bandwidth than H - This is 
only possible if the first and last nonzero in each column of Hj is situated on an 
even row. Denote by 2Aq (£) the first nonzero row in column £ of Hj and by 
the last nonzero row in the same column. Then the Cth column of (1221) reads as 


H 


j, 2k,£ 


= H 


F+i] 


j, 2k,e 


k 2 (i) 

E 


m=k\(£) 


rr[ s +l] rrF] 

u jjkjm 12 j,2m+\,i > 


(73) 


We assume that k] (£) and k>> (l) are strictly increasing functions, otherwise a more 
careful design of the update step is needed. Let ^(/c) be the inverse of k\(£), i.e., 
(2 (/>' 1 (()) = i. In words, (2 (A;) is the last column on row k with a nonzero element. 
In a similar way, let l\(k) be the inverse of k-)(£). In a given column £, we impose 
for k = k] (£) and for k = k>>(£) that H^k \ = 0, so that the number of non-zeros 
in column l of equals £2 (£) — k\ (£) — 1. Note that the number of nonzero 

rows equals k 2 {£) — k\(l) + 1 for H^\ and ) — k\{£) for H^\. For k = k\(£), 
and for k = k’ 2 ((% we obtain the two equations of the form 


H 


j,2k,£ 


k 2 (e)~ 1 

E 


m=ki (£) 


rr[ s +l] rrl 5 ] 

u j^kjm 12 j,2m+1,1' 


(74) 
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These two equations contain as unknowns two partial rows in matrix U^ +l \ In¬ 
stead of solving the two equations for fixed £, we consider the two equations for 
given k, by looking at £ = and £ = £\{k). 

For k = k\ (£), i.e., £ = we set = Ofor all m = k+ 1,..., k^zipiik)) — 

1, while for m = k, we must then take 


U, 


>+i] 


j,k,k 


= -H 


>] 


j,2k/ 2 (k) 


/ wM 


(75) 


in order to satisfy (l74l) for k = k\ (£) or £ = l 2 {k). 

For k = k 2 {£), i.e., £ = £\{k), we set C/j 5 ^ = Ofor all m = k\{£\{k )),..., k— 
2, while for m = k — 1, we must then take 


rr[ s +( __rrM /rrM 

u j,k,k-l — I1 j, 2 k,e 1 (k)/ £1 j, 2 k-l,e 1 {k)’ 


(76) 


in order to satisfy (1741) for k = k 2 {£) or £ = ^i(fe). 

Once the diagonal and the lower diagonal of has been found, all the 

other entries of this matrix can be filled with zeros. This leaves us with k 2 {£) — 
k\(£) — 1 equations of the form (1731) . Each of these equations allows us to find 
exactly one element in column £ of 

\s] fsl 

The case where the bandwidth of Hj is larger than that of H-‘ e is treated in 

fsl fsl 

a similar way. The case where the bandwidths of iTjand H;- n are equal can be 

fsl 

reduced to the first case if we artificially increase the bandwidth of 77j ‘ e , by taking 

fsl 

an additional zero into account in each column of H\ ' □ 


D Proof of Proposition Q] 


First, it can be checked that (1251) holds for any p and for q = 1 and for q = p — 1. 
More precisely, a bit of calculations show that from (fl2l) . it follows indeed that, 


r(pd4l] 

Z j,k 


tj,k+p -1 tj,k +1 
tj,k+p~ tj,k 


( ^ \ 

I t 3,k + t 3,k+ i + t j,k+p I 


In a similar way, starting from (fT3T) . we arnve at 


Pp , p - 141] 
f j,k 


^j,k+p— 1 tj,k+l 

^ j,k+p tj,k 


I tj,k ' n t],k+i ■ tj,k+p I 

V i=2 / 


(77) 


(78) 


As a result, for p = 1, 2 the result holds for all q = 0,... ,p — 1. This is the 
basis for the subsequent induction argument. 

Using the recursion in (fl4l) . we can find the coefficient in two steps. 


4p,q,Lla] 
l j,k 


First we define ffT” J as the coefficient that results from taking out tj^+i from 
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7 M 


t 


j,k 


and replacing it by tj This intermediate coefficient equals 


4p,g,Lla] _ 4p,g] q ~{p-l,q-l] . n, 

~ *• "r V 6 J,A: 4,fc+lj • 


j,k L j,k 

We define a similar coefficient for p — 1 and q — 1. 


4p-l,g-l,Lla] _ 4p-l,g-l] q— 1 4p-2,g-2] 




'j,k 




+ |z2 t & + i 

P Z 


~m,g,Llal ... 

t L j k J and replace it with 

apply (fl4l) . we need the power coefficient of degree p — 1 for q — 1 based on all 


Next, we take out , from t- k 


In order to 


. . . ~m— l,q— l,Lla| 

remaining knots in t- k . This is exactly t - k 


^p,q,Llb] _ ^[p,(j,Lla] 


tj,k 


j,k 


+ p-l ^ 


. We thus find 

0 


J-'j,k+p tj,k+p- 


Using the expressions above for t- k 
developed into 


4p,q,Llb] _ 4p,<j] q 
l j,k - tj,k + 


P-1 [ 


4p,(j,Lla] 

l j,k 

4p-l,q—l] 


l,q— l,Lla] 

’ t j,k 


and for t :' L . ’ 1 " ' 1 , this can further be 


j,k + tj,k+p tj } k +1 tj k+ ~_ i) 

+ ~_ 9 ^',fc+i ^ (X?4 — i i4+i)(^',fc+p — 

P Z 


(79) 

Expression (l79l ) can be substituted in the right hand side of (1251 ). We now work 


on t 


q{p,q,Ll] . 


j,k 


in the left hand side, starting from its definition in ([I]), and using once 


more the recursion (fl4l) . We find 

xjp.eT 1 ] _ U,fc+p-1 — 4,fe+l x[p,g] 
tjk — , , tj k 

t j,k+p ~ tj,k 


P 1 ^j,k+p tj,k 


4jfc+1 ^.7'jfc+P *J,fc+l)(*J,AH-l *j,fc+p-l)(*j,fc+p ^j,fe+p-l) 


For the last factor in this expression, we use again the recursion (1 1 41 ). this time for 

-4p-l,q-l] _ 4p—l,q—l] q- 1 4?-2,<?-2] / _ 

4,fc+i - t g_2 W V44+p-i Tfc+1 

Substitution of this recursion, followed by straightforward algebraic manipulation, 
amounts to (l25l) . thereby completing the proof. 
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